Author: William Nabhan Filho

1. Importing Libraries and Modules

In [1]:
import os, sys, warnings

sys.path.append(os.path.abspath('..'))

from src.association_rules_utils import plot_dispersao_support_lift_interativo, gerar_regras_por_cluster, salvar_regras_apriori_xlsx

warnings.filterwarnings('ignore', category=DeprecationWarning)  # Suppress DeprecationWarning messages

import pandas as pd

2. Data Preparation

In [2]:
data = pd.read_parquet('../data/processed/data.parquet')
In [3]:
user_id_clusters = pd.read_csv('../results/user_id_clusters.csv')
In [4]:
# Merge the original DataFrame with the cluster assignments
data_com_cluster = data.merge(user_id_clusters, on='user_id', how='left')
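
A left merge keeps every row of `data`, so any `user_id` missing from the cluster file silently ends up with a missing cluster. The sketch below shows one way to check for that, using toy frames with hypothetical values; the column name `cluster` is an assumption, since the notebook does not show the contents of `user_id_clusters.csv`.

```python
import pandas as pd

# Toy frames (hypothetical values) standing in for `data` and `user_id_clusters`
data = pd.DataFrame(
    {"user_id": [1, 1, 2, 3], "aisle": ["milk", "yogurt", "milk", "cream"]}
)
user_id_clusters = pd.DataFrame({"user_id": [1, 2], "cluster": [0, 1]})

merged = data.merge(
    user_id_clusters,
    on="user_id",
    how="left",
    validate="many_to_one",  # raises if a user_id appears more than once in the cluster table
    indicator=True,          # adds a `_merge` column: 'both' or 'left_only'
)

# Rows whose user_id had no match in the cluster table
unmatched = merged[merged["_merge"] == "left_only"]
n_unmatched_users = unmatched["user_id"].nunique()
print(f"Users without a cluster: {n_unmatched_users}")
```

If the count is not zero, those rows can be dropped or inspected before the rules are generated.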

3. Generating Association Rules

Profile Summary

Cluster 0: Varied consumers focused on dairy and fresh produce, with planned purchases (possibly families).
Cluster 1: Health-focused consumers with an emphasis on fresh produce and dairy (nutrition-conscious individuals, some possibly vegetarians, athletes, and fitness enthusiasts).
Cluster 2: Convenience-seeking consumers who prioritize snacks and beverages (young or single people with a casual, active lifestyle).

In [5]:
# Build a department filter matching each cluster's profile, to analyze strong associations among the products of interest to that group
departments_filtros = {
    0: ['dairy eggs', 'produce', 'frozen', 'snacks'],
    1: ['dairy eggs', 'produce'],
    2: ['beverages', 'snacks'],
}

r0, r1, r2 = gerar_regras_por_cluster(data_com_cluster, coluna='aisle', departments_filtros=departments_filtros)
In [6]:
# Export the rules to xlsx for more interactive analysis
salvar_regras_apriori_xlsx(r0, '../results/regras_apriori_cluster_0.xlsx')
salvar_regras_apriori_xlsx(r1, '../results/regras_apriori_cluster_1.xlsx')
salvar_regras_apriori_xlsx(r2, '../results/regras_apriori_cluster_2.xlsx')

Note: The rules were extracted with the Apriori algorithm, configured with a minimum support of 0.02. This value was chosen given the size of the dataset and the high diversity of product aisles (134 in total), which naturally lowers the individual frequency of many combinations. A 2% support threshold therefore captures relevant, recurring patterns while excluding rare or unrepresentative associations.
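
To make the role of the minimum support concrete, here is a minimal sketch on hypothetical toy baskets. It counts itemsets by brute-force enumeration rather than with Apriori's pruned search, and it is not the implementation behind `gerar_regras_por_cluster`; it only illustrates that an itemset is kept when the fraction of baskets containing it reaches the threshold. The threshold is 0.4 here because the toy data has only five baskets.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(baskets, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force counting (no Apriori pruning), fine for tiny examples only.
    """
    n = len(baskets)
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Toy baskets (hypothetical data, not taken from the notebook's dataset)
baskets = [
    {"milk", "yogurt"},
    {"milk", "fresh fruits"},
    {"milk", "yogurt", "fresh fruits"},
    {"fresh fruits"},
    {"candy chocolate"},
]

frequent = frequent_itemsets(baskets, min_support=0.4)
for itemset, support in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

Lowering `min_support` admits rarer itemsets such as `{candy chocolate}`. With 134 aisles, the same trade-off decides how many low-frequency combinations enter the rule set.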

4. Analyzing the Rules

4.1 Rules - Cluster 0

In [7]:
r0.describe()
Out[7]:
antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski
count 3502.000000 3502.000000 3502.000000 3502.000000 3502.000000 3502.0 3502.000000 3502.000000 3502.000000 3502.000000 3502.000000 3502.000000
mean 0.192753 0.192753 0.034286 0.289092 1.546904 1.0 0.010781 1.210000 0.422946 0.099528 0.140484 0.289092
std 0.145238 0.145238 0.021391 0.208068 0.281979 0.0 0.005747 0.301715 0.154455 0.041002 0.139805 0.075021
min 0.024318 0.024318 0.020043 0.036710 0.894700 1.0 -0.006396 0.955418 -0.145188 0.036425 -0.046663 0.151197
25% 0.073880 0.073880 0.022301 0.113933 1.354589 1.0 0.007283 1.037343 0.316007 0.070906 0.035998 0.229971
50% 0.146497 0.146497 0.027593 0.228413 1.498674 1.0 0.009691 1.095407 0.420715 0.094418 0.087097 0.279483
75% 0.294816 0.294816 0.036819 0.434331 1.701769 1.0 0.012761 1.254746 0.539496 0.118185 0.203026 0.340070
max 0.545974 0.545974 0.271770 0.862903 2.536850 1.0 0.050419 3.629868 0.862867 0.392640 0.724508 0.574003
In [8]:
# Rules with low consequent support and high lift
r0[(r0['consequent support'] < 0.1) & (r0['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[8]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
3487 (milk, packaged vegetables fruits) (packaged cheese, yogurt, fresh vegetables) 0.125394 0.068954 0.021935 0.174926 2.536850 1.0 0.013288 1.128439 0.692666 0.127221 0.113820 0.246516 [dairy eggs, produce] [dairy eggs, produce]
3495 (yogurt, fresh vegetables) (packaged cheese, milk, packaged vegetables fr... 0.157283 0.055750 0.021935 0.139460 2.501510 1.0 0.013166 1.097276 0.712269 0.114782 0.088652 0.266452 [dairy eggs, produce] [dairy eggs, produce]
3488 (milk, fresh vegetables) (packaged cheese, packaged vegetables fruits, ... 0.135697 0.065442 0.021935 0.161644 2.470033 1.0 0.013054 1.114751 0.688586 0.122400 0.102938 0.248410 [dairy eggs, produce] [dairy eggs, produce]
3496 (packaged cheese, yogurt) (milk, packaged vegetables fruits, fresh veget... 0.120520 0.074117 0.021935 0.182000 2.455590 1.0 0.013002 1.131887 0.673996 0.127009 0.116520 0.238974 [dairy eggs] [dairy eggs, produce]
3493 (packaged vegetables fruits, yogurt) (milk, packaged cheese, fresh vegetables) 0.149005 0.060051 0.021935 0.147208 2.451380 1.0 0.012987 1.102202 0.695734 0.117222 0.092725 0.256237 [dairy eggs, produce] [dairy eggs, produce]
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
215 (yogurt) (fresh herbs) 0.335031 0.071591 0.025154 0.075079 1.048712 1.0 0.001168 1.003770 0.069852 0.065939 0.003756 0.213215 [dairy eggs] [produce]
75 (fresh vegetables) (cookies cakes) 0.417956 0.072509 0.031540 0.075463 1.040735 1.0 0.001234 1.003195 0.067246 0.068726 0.003185 0.255222 [produce] [snacks]
320 (milk) (nuts seeds dried fruit) 0.307068 0.084576 0.026813 0.087319 1.032438 1.0 0.000842 1.003006 0.045342 0.073494 0.002997 0.202174 [dairy eggs] [snacks]
208 (milk) (fresh herbs) 0.307068 0.071591 0.022676 0.073846 1.031498 1.0 0.000692 1.002435 0.044068 0.063699 0.002429 0.195293 [dairy eggs] [produce]
21 (fresh vegetables) (candy chocolate) 0.417956 0.074349 0.031980 0.076515 1.029122 1.0 0.000905 1.002345 0.048619 0.069472 0.002339 0.253321 [produce] [snacks]

1237 rows × 16 columns

  • This filter shows which antecedents boost the purchase of consequents that are infrequent overall (consequent support below 0.1).
In [9]:
# Rules with high consequent support and high lift
r0[(r0['consequent support'] > 0.5) & (r0['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[9]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
3414 (packaged vegetables fruits, soy lactosefree, ... (fresh fruits) 0.025115 0.545974 0.021672 0.862903 1.580483 1.0 0.007960 3.311716 0.376744 0.039445 0.698042 0.451298 [dairy eggs, produce] [produce]
3325 (milk, packaged cheese, yogurt, fresh vegetables) (fresh fruits) 0.032902 0.545974 0.028369 0.862213 1.579220 1.0 0.010405 3.295138 0.379254 0.051532 0.696523 0.457086 [dairy eggs, produce] [produce]
3354 (milk, packaged vegetables fruits, yogurt, fre... (fresh fruits) 0.040400 0.545974 0.034828 0.862080 1.578975 1.0 0.012771 3.291949 0.382115 0.063147 0.696229 0.462935 [dairy eggs, produce] [produce]
3444 (packaged cheese, milk, packaged vegetables fr... (fresh fruits) 0.031760 0.545974 0.027347 0.861058 1.577104 1.0 0.010007 3.267743 0.377929 0.049687 0.693978 0.455574 [dairy eggs, produce] [produce]
2417 (packaged vegetables fruits, energy granola ba... (fresh fruits) 0.027838 0.545974 0.023895 0.858359 1.572160 1.0 0.008696 3.205470 0.374354 0.043453 0.688033 0.451063 [produce, snacks] [produce]
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
19 (candy chocolate) (fresh fruits) 0.074349 0.545974 0.044856 0.603315 1.105025 1.0 0.004263 1.144550 0.102677 0.077947 0.126295 0.342737 [snacks] [produce]
5 (butter) (fresh fruits) 0.095620 0.545974 0.057207 0.598269 1.095783 1.0 0.005000 1.130174 0.096652 0.097892 0.115181 0.351524 [dairy eggs] [produce]
114 (cream) (fresh fruits) 0.118395 0.545974 0.069480 0.586846 1.074859 1.0 0.004839 1.098925 0.078999 0.116794 0.090020 0.357052 [dairy eggs] [produce]
177 (frozen meals) (fresh fruits) 0.106726 0.545974 0.062365 0.584350 1.070287 1.0 0.004096 1.092325 0.073518 0.105644 0.084522 0.349288 [frozen] [produce]
187 (ice cream ice) (fresh fruits) 0.147992 0.545974 0.084808 0.573059 1.049608 1.0 0.004008 1.063439 0.055473 0.139222 0.059655 0.364196 [frozen] [produce]

225 rows × 16 columns

  • This filter shows which antecedents further boost the purchase of consequents that are already frequent overall (consequent support above 0.5).
In [10]:
# Rules with high consequent support and low lift
r0[(r0['consequent support'] > 0.4) & (r0['lift'] < 1.0)].sort_values(by='lift', ascending=True)
Out[10]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
  • This filter shows which antecedents weaken the purchase of consequents that are frequent overall (consequent support above 0.4). For Cluster 0, no rule meets both conditions.
In [11]:
# Rules with lift < 1.0 (negative associations)
r0[r0['lift'] < 1.0]
Out[11]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
309 (ice cream ice) (milk) 0.147992 0.307068 0.045429 0.306971 0.999685 1.0 -0.000014 0.999860 -0.000370 0.110903 -0.000140 0.227458 [frozen] [dairy eggs]
308 (milk) (ice cream ice) 0.307068 0.147992 0.045429 0.147945 0.999685 1.0 -0.000014 0.999945 -0.000455 0.110903 -0.000055 0.227458 [dairy eggs] [frozen]
2625 (fresh fruits, soy lactosefree, fresh vegetables) (milk) 0.070553 0.307068 0.021642 0.306743 0.998943 1.0 -0.000023 0.999532 -0.001137 0.060795 -0.000468 0.188611 [dairy eggs, produce] [dairy eggs]
2636 (milk) (fresh fruits, soy lactosefree, fresh vegetables) 0.307068 0.070553 0.021642 0.070478 0.998943 1.0 -0.000023 0.999920 -0.001525 0.060795 -0.000080 0.188611 [dairy eggs] [dairy eggs, produce]
1314 (fresh fruits, soy lactosefree) (milk) 0.127204 0.307068 0.038849 0.305407 0.994591 1.0 -0.000211 0.997609 -0.006192 0.098246 -0.002397 0.215961 [dairy eggs, produce] [dairy eggs]
1319 (milk) (fresh fruits, soy lactosefree) 0.307068 0.127204 0.038849 0.126516 0.994591 1.0 -0.000211 0.999212 -0.007787 0.098246 -0.000788 0.215961 [dairy eggs] [dairy eggs, produce]
22 (milk) (candy chocolate) 0.307068 0.074349 0.022702 0.073931 0.994367 1.0 -0.000129 0.999548 -0.008108 0.063286 -0.000452 0.189634 [dairy eggs] [snacks]
23 (candy chocolate) (milk) 0.074349 0.307068 0.022702 0.305338 0.994367 1.0 -0.000129 0.997510 -0.006082 0.063286 -0.002496 0.189634 [snacks] [dairy eggs]
269 (milk) (frozen meals) 0.307068 0.106726 0.032458 0.105703 0.990422 1.0 -0.000314 0.998857 -0.013764 0.085117 -0.001144 0.204915 [dairy eggs] [frozen]
268 (frozen meals) (milk) 0.106726 0.307068 0.032458 0.304127 0.990422 1.0 -0.000314 0.995773 -0.010710 0.085117 -0.004245 0.204915 [frozen] [dairy eggs]
1525 (milk, fresh vegetables) (soy lactosefree) 0.135697 0.197795 0.026210 0.193147 0.976498 1.0 -0.000631 0.994239 -0.027092 0.085294 -0.005795 0.162828 [dairy eggs, produce] [dairy eggs]
1528 (soy lactosefree) (milk, fresh vegetables) 0.197795 0.135697 0.026210 0.132508 0.976498 1.0 -0.000631 0.996324 -0.029128 0.085294 -0.003690 0.162828 [dairy eggs] [dairy eggs, produce]
1713 (milk) (packaged vegetables fruits, soy lactosefree) 0.307068 0.086709 0.025856 0.084203 0.971108 1.0 -0.000769 0.997264 -0.041168 0.070277 -0.002743 0.191200 [dairy eggs] [dairy eggs, produce]
1712 (packaged vegetables fruits, soy lactosefree) (milk) 0.086709 0.307068 0.025856 0.298196 0.971108 1.0 -0.000769 0.987358 -0.031549 0.070277 -0.012803 0.191200 [dairy eggs, produce] [dairy eggs]
344 (other creams cheeses) (soy lactosefree) 0.116797 0.197795 0.022206 0.190127 0.961229 1.0 -0.000896 0.990531 -0.043675 0.075948 -0.009560 0.151197 [dairy eggs] [dairy eggs]
345 (soy lactosefree) (other creams cheeses) 0.197795 0.116797 0.022206 0.112268 0.961229 1.0 -0.000896 0.994899 -0.047873 0.075948 -0.005127 0.151197 [dairy eggs] [dairy eggs]
1527 (milk) (soy lactosefree, fresh vegetables) 0.307068 0.094116 0.026210 0.085354 0.906900 1.0 -0.002691 0.990420 -0.129033 0.069897 -0.009673 0.181917 [dairy eggs] [dairy eggs, produce]
1526 (soy lactosefree, fresh vegetables) (milk) 0.094116 0.307068 0.026210 0.278480 0.906900 1.0 -0.002691 0.960378 -0.101788 0.069897 -0.041256 0.181917 [dairy eggs, produce] [dairy eggs]
328 (milk) (soy lactosefree) 0.307068 0.197795 0.054341 0.176968 0.894700 1.0 -0.006396 0.974694 -0.145188 0.120618 -0.025963 0.225851 [dairy eggs] [dairy eggs]
329 (soy lactosefree) (milk) 0.197795 0.307068 0.054341 0.274734 0.894700 1.0 -0.006396 0.955418 -0.127941 0.120618 -0.046663 0.225851 [dairy eggs] [dairy eggs]
  • This filter yields combinations that should be avoided in marketing campaigns. Most of these lifts are only slightly below 1 (the minimum is 0.8947), so the negative association is weak.
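
The lift and leverage columns can be recomputed from the three support columns, which helps when judging how far a rule really is from independence. The sketch below uses the standard definitions and the values printed for the (milk) ➔ (soy lactosefree) row of Out[11]; the helper function names are illustrative, not part of the project's `src` module.

```python
def lift(support, antecedent_support, consequent_support):
    """Observed co-occurrence divided by the co-occurrence expected under independence."""
    return support / (antecedent_support * consequent_support)


def leverage(support, antecedent_support, consequent_support):
    """Observed co-occurrence minus the co-occurrence expected under independence."""
    return support - antecedent_support * consequent_support


# Values copied from the (milk) -> (soy lactosefree) row of Out[11]
ant, cons, sup = 0.307068, 0.197795, 0.054341

milk_soy_lift = lift(sup, ant, cons)
milk_soy_leverage = leverage(sup, ant, cons)

# Confidence depends on the direction of the rule, unlike lift and leverage
conf_milk_to_soy = sup / ant
conf_soy_to_milk = sup / cons

print(round(milk_soy_lift, 4), round(milk_soy_leverage, 6))
print(round(conf_milk_to_soy, 4), round(conf_soy_to_milk, 4))
```

A lift of 1 (leverage of 0) means the two sides co-occur exactly as often as independence would predict, so values such as 0.9997 indicate almost no association in either direction.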
In [12]:
lift_medio_antecedents = r0.groupby('antecedents')['lift'].mean()

# Itemsets that, on average, boost the purchase of other items and lead to cross-selling
lift_medio_antecedents.sort_values(ascending=False).head(20)
Out[12]:
antecedents
(fresh fruits, fresh herbs, packaged vegetables fruits)        2.008946
(milk, yogurt, fresh vegetables)                               1.940806
(fresh herbs, yogurt)                                          1.928586
(packaged cheese, milk, packaged vegetables fruits)            1.916418
(milk, packaged cheese, fresh vegetables)                      1.905736
(packaged vegetables fruits, fresh herbs)                      1.889010
(packaged cheese, yogurt, fresh vegetables)                    1.885132
(fresh fruits, frozen produce, yogurt, fresh vegetables)       1.882529
(frozen produce, yogurt, fresh vegetables)                     1.864406
(packaged vegetables fruits, yogurt, fresh vegetables)         1.861077
(packaged cheese, packaged vegetables fruits, yogurt)          1.860961
(milk, packaged vegetables fruits, yogurt)                     1.854378
(milk, packaged vegetables fruits, fresh vegetables)           1.847622
(packaged cheese, fresh fruits, packaged vegetables fruits)    1.832870
(fresh fruits, yogurt, chips pretzels, fresh vegetables)       1.829722
(fresh fruits, packaged cheese, eggs, fresh vegetables)        1.822667
(fresh fruits, eggs, yogurt, fresh vegetables)                 1.820113
(milk, packaged cheese, fresh fruits)                          1.815288
(fresh fruits, fresh herbs)                                    1.813267
(packaged cheese, packaged vegetables fruits)                  1.812818
Name: lift, dtype: float64
In [13]:
# Itemsets that, on average, boost the purchase of other items the least
lift_medio_antecedents.sort_values(ascending=True).head(10)
Out[13]:
antecedents
(frozen vegan vegetarian)          1.117200
(packaged produce)                 1.137986
(candy chocolate)                  1.151076
(ice cream ice)                    1.165093
(cream, milk)                      1.196101
(frozen meals)                     1.197348
(cream)                            1.226311
(chips pretzels, ice cream ice)    1.231108
(cookies cakes)                    1.232200
(frozen pizza)                     1.240174
Name: lift, dtype: float64
In [14]:
plot_dispersao_support_lift_interativo(r0, cor = 'springgreen')

For itemsets, support and lift do not depend on the direction of the rule. Therefore, {A} ➔ {B} and {B} ➔ {A} have exactly the same support and lift, and keeping both duplicates information without adding any.

Since the goal is to capture co-occurrence rather than infer a causal relationship, the {A} ⟷ {B} representation reflects the objective of the analysis more faithfully, highlighting the bidirectional relationship between the item groups (antecedents and consequents).
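
One way to collapse mirrored rules into a single {A} ⟷ {B} entry is sketched below. It assumes the `antecedents` and `consequents` columns hold frozensets, which the printed output suggests but the notebook does not confirm. The `pair` column and the `pair_key` helper are hypothetical names introduced for the example. Only direction-independent columns (support and lift) are kept, because confidence differs between the two directions.

```python
import pandas as pd

# Toy rules table: two mirrored rows copied from Out[11], plus one unrelated rule
rules = pd.DataFrame(
    {
        "antecedents": [
            frozenset({"milk"}),
            frozenset({"soy lactosefree"}),
            frozenset({"milk"}),
        ],
        "consequents": [
            frozenset({"soy lactosefree"}),
            frozenset({"milk"}),
            frozenset({"ice cream ice"}),
        ],
        "support": [0.054341, 0.054341, 0.045429],
        "lift": [0.894700, 0.894700, 0.999685],
    }
)


def pair_key(row):
    """Build a direction-independent label such as 'milk <-> soy lactosefree'."""
    sides = sorted(
        ", ".join(sorted(side)) for side in (row["antecedents"], row["consequents"])
    )
    return " <-> ".join(sides)


rules["pair"] = rules.apply(pair_key, axis=1)

# Keep one row per unordered pair, with only the symmetric metrics
unique_pairs = rules.drop_duplicates(subset="pair")[["pair", "support", "lift"]]
print(unique_pairs)
```

Each point in a support-versus-lift scatter plot built from `unique_pairs` then corresponds to one co-occurrence relationship instead of two overlapping rules.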

4.2 Rules - Cluster 1

In [15]:
r1.describe()
Out[15]:
antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski
count 1826.000000 1826.000000 1826.000000 1826.000000 1826.000000 1826.0 1826.000000 1826.000000 1826.000000 1826.000000 1826.000000 1826.000000
mean 0.240619 0.240619 0.044122 0.327919 1.439217 1.0 0.010885 1.306554 0.396799 0.104869 0.159119 0.327919
std 0.216620 0.216620 0.040072 0.266153 0.264952 0.0 0.006970 0.611965 0.187164 0.056129 0.186508 0.108885
min 0.022348 0.022348 0.020020 0.026911 0.574715 1.0 -0.029361 0.547423 -0.678611 0.026708 -0.826742 0.143255
25% 0.086717 0.086717 0.024915 0.130367 1.248406 1.0 0.006933 1.037303 0.283273 0.064660 0.035961 0.227071
50% 0.164543 0.164543 0.031331 0.230600 1.406250 1.0 0.009644 1.096366 0.403407 0.100946 0.087896 0.313428
75% 0.320304 0.320304 0.044104 0.473518 1.595352 1.0 0.013036 1.266168 0.518933 0.126320 0.210216 0.431554
max 0.755371 0.755371 0.519183 0.956311 2.502336 1.0 0.052969 7.623897 0.908526 0.574887 0.868833 0.732902
In [16]:
# Rules with low consequent support and high lift
r1[(r1['consequent support'] < 0.1) & (r1['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[16]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
1801 (milk, packaged vegetables fruits, fresh fruits) (packaged cheese, yogurt) 0.115452 0.073494 0.021233 0.183907 2.502336 1.0 0.012747 1.135294 0.678735 0.126599 0.119171 0.236404 [dairy eggs, produce] [dairy eggs]
1811 (milk, packaged vegetables fruits) (fresh fruits, packaged cheese, yogurt) 0.131249 0.066054 0.021233 0.161773 2.449096 1.0 0.012563 1.114192 0.681077 0.120591 0.102489 0.241607 [dairy eggs, produce] [dairy eggs, produce]
1817 (packaged vegetables fruits, yogurt) (milk, packaged cheese, fresh fruits) 0.152703 0.059252 0.021233 0.139044 2.346667 1.0 0.012185 1.092679 0.677288 0.111327 0.084818 0.248694 [dairy eggs, produce] [dairy eggs, produce]
1651 (milk, fresh fruits, fresh vegetables) (packaged cheese, yogurt) 0.142155 0.073494 0.024215 0.170344 2.317792 1.0 0.013768 1.116735 0.662771 0.126494 0.104532 0.249915 [dairy eggs, produce] [dairy eggs]
1397 (milk, packaged vegetables fruits) (packaged cheese, yogurt) 0.131249 0.073494 0.022348 0.170274 2.316835 1.0 0.012702 1.116640 0.654246 0.122527 0.104457 0.237178 [dairy eggs, produce] [dairy eggs]
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
13 (fresh fruits) (cream) 0.755371 0.072538 0.057686 0.076368 1.052808 1.0 0.002893 1.004147 0.205041 0.074896 0.004130 0.435814 [produce] [dairy eggs]
0 (fresh fruits) (butter) 0.755371 0.076591 0.060020 0.079458 1.037437 1.0 0.002166 1.003115 0.147511 0.077752 0.003105 0.431554 [produce] [dairy eggs]
56 (fresh fruits) (specialty cheeses) 0.755371 0.026054 0.020327 0.026911 1.032877 1.0 0.000647 1.000880 0.130119 0.026708 0.000880 0.403558 [produce] [dairy eggs]
458 (fresh fruits) (packaged produce, packaged vegetables fruits) 0.755371 0.042710 0.032777 0.043391 1.015960 1.0 0.000515 1.000713 0.064216 0.042828 0.000712 0.405409 [produce] [produce]
368 (fresh fruits) (packaged produce, fresh vegetables) 0.755371 0.041748 0.031815 0.042118 1.008865 1.0 0.000280 1.000386 0.035920 0.041571 0.000386 0.402092 [produce] [produce]

514 rows × 16 columns

In [17]:
# Rules with high consequent support and high lift
r1[(r1['consequent support'] > 0.5) & (r1['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[17]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
1593 (packaged vegetables fruits, fresh herbs, yogurt) (fresh fruits, fresh vegetables) 0.033414 0.519183 0.029173 0.873083 1.681648 1.0 0.011825 3.788449 0.419358 0.055736 0.736040 0.464637 [dairy eggs, produce] [produce]
1561 (packaged vegetables fruits, fresh herbs, soy ... (fresh fruits, fresh vegetables) 0.025986 0.519183 0.022428 0.863089 1.662397 1.0 0.008937 3.511885 0.409089 0.042904 0.715253 0.453144 [dairy eggs, produce] [produce]
1501 (milk, fresh herbs, packaged vegetables fruits) (fresh fruits, fresh vegetables) 0.027745 0.519183 0.023521 0.847764 1.632879 1.0 0.009116 3.158357 0.398645 0.044938 0.683380 0.446534 [dairy eggs, produce] [produce]
1533 (packaged cheese, packaged vegetables fruits, ... (fresh fruits, fresh vegetables) 0.032287 0.519183 0.027175 0.841678 1.621159 1.0 0.010412 3.036963 0.395941 0.051832 0.670724 0.447010 [dairy eggs, produce] [produce]
909 (fresh herbs, soy lactosefree) (fresh fruits, fresh vegetables) 0.037923 0.519183 0.031331 0.826178 1.591304 1.0 0.011642 2.766150 0.386231 0.059590 0.638487 0.443262 [dairy eggs, produce] [produce]
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
57 (specialty cheeses) (fresh fruits) 0.026054 0.755371 0.020327 0.780205 1.032877 1.0 0.000647 1.112990 0.032682 0.026708 0.101519 0.403558 [dairy eggs] [produce]
42 (fresh fruits) (fresh vegetables) 0.755371 0.666917 0.519183 0.687322 1.030596 1.0 0.015414 1.065260 0.121360 0.574887 0.061262 0.732902 [produce] [produce]
43 (fresh vegetables) (fresh fruits) 0.666917 0.755371 0.519183 0.778482 1.030596 1.0 0.015414 1.104333 0.089131 0.574887 0.094476 0.732902 [produce] [produce]
455 (packaged produce, packaged vegetables fruits) (fresh fruits) 0.042710 0.755371 0.032777 0.767426 1.015960 1.0 0.000515 1.051836 0.016410 0.042828 0.049281 0.405409 [produce] [produce]
365 (packaged produce, fresh vegetables) (fresh fruits) 0.041748 0.755371 0.031815 0.762067 1.008865 1.0 0.000280 1.028144 0.009170 0.041571 0.027373 0.402092 [produce] [produce]

353 rows × 16 columns

In [18]:
# Rules with high consequent support and low lift
r1[(r1['consequent support'] > 0.4) & (r1['lift'] < 1.0)].sort_values(by='lift', ascending=True)
Out[18]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
367 (packaged produce) (fresh fruits, fresh vegetables) 0.106624 0.519183 0.031815 0.298382 0.574715 1.0 -0.023543 0.685297 -0.453047 0.053561 -0.459220 0.179830 [produce] [produce]
78 (packaged produce) (fresh vegetables) 0.106624 0.666917 0.041748 0.391543 0.587095 1.0 -0.029361 0.547423 -0.440479 0.057049 -0.826742 0.227071 [produce] [produce]
364 (packaged produce, fresh fruits) (fresh vegetables) 0.071382 0.666917 0.031815 0.445694 0.668290 1.0 -0.015791 0.600900 -0.348327 0.045032 -0.664169 0.246699 [produce] [produce]
457 (packaged produce) (fresh fruits, packaged vegetables fruits) 0.106624 0.415662 0.032777 0.307405 0.739555 1.0 -0.011543 0.843693 -0.282740 0.066958 -0.185265 0.193129 [produce] [produce]
110 (packaged produce) (packaged vegetables fruits) 0.106624 0.514914 0.042710 0.400566 0.777928 1.0 -0.012192 0.809240 -0.242158 0.073787 -0.235727 0.241756 [produce] [produce]
598 (packaged produce, packaged vegetables fruits) (fresh vegetables) 0.042710 0.666917 0.023897 0.559510 0.838949 1.0 -0.004587 0.756164 -0.167036 0.034848 -0.322465 0.297670 [produce] [produce]
50 (packaged produce) (fresh fruits) 0.106624 0.755371 0.071382 0.669478 0.886291 1.0 -0.009158 0.740131 -0.125576 0.090287 -0.351112 0.381989 [produce] [produce]
454 (packaged produce, fresh fruits) (packaged vegetables fruits) 0.071382 0.514914 0.032777 0.459171 0.891742 1.0 -0.003979 0.896930 -0.115617 0.059215 -0.114914 0.261413 [produce] [produce]
In [19]:
# Rules with lift < 1.0 (negative associations)
r1[r1['lift'] < 1.0]
Out[19]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
1018 (fresh fruits, milk) (soy lactosefree, fresh vegetables) 0.191440 0.133309 0.025462 0.133002 0.997696 1.0 -0.000059 0.999646 -0.002848 0.085075 -0.000354 0.162001 [dairy eggs, produce] [dairy eggs, produce]
1023 (soy lactosefree, fresh vegetables) (fresh fruits, milk) 0.133309 0.191440 0.025462 0.190999 0.997696 1.0 -0.000059 0.999455 -0.002657 0.085075 -0.000545 0.162001 [dairy eggs, produce] [dairy eggs, produce]
1183 (fresh fruits, packaged vegetables fruits, soy... (milk) 0.093662 0.235060 0.021312 0.227543 0.968021 1.0 -0.000704 0.990269 -0.035167 0.069328 -0.009827 0.159105 [dairy eggs, produce] [dairy eggs]
1194 (milk) (fresh fruits, packaged vegetables fruits, soy... 0.235060 0.093662 0.021312 0.090667 0.968021 1.0 -0.000704 0.996706 -0.041399 0.069328 -0.003305 0.159105 [dairy eggs] [dairy eggs, produce]
551 (milk, fresh vegetables) (soy lactosefree) 0.167003 0.175678 0.028126 0.168416 0.958667 1.0 -0.001213 0.991268 -0.049212 0.089415 -0.008809 0.164258 [dairy eggs, produce] [dairy eggs]
554 (soy lactosefree) (milk, fresh vegetables) 0.175678 0.167003 0.028126 0.160100 0.958667 1.0 -0.001213 0.991782 -0.049704 0.089415 -0.008287 0.164258 [dairy eggs] [dairy eggs, produce]
1016 (fresh fruits, soy lactosefree, fresh vegetables) (milk) 0.113694 0.235060 0.025462 0.223952 0.952744 1.0 -0.001263 0.985686 -0.052997 0.078758 -0.014521 0.166137 [dairy eggs, produce] [dairy eggs]
1025 (milk) (fresh fruits, soy lactosefree, fresh vegetables) 0.235060 0.113694 0.025462 0.108321 0.952744 1.0 -0.001263 0.993975 -0.060893 0.078758 -0.006062 0.166137 [dairy eggs] [dairy eggs, produce]
411 (soy lactosefree) (fresh fruits, milk) 0.175678 0.191440 0.031581 0.179768 0.939031 1.0 -0.002050 0.985770 -0.073013 0.094122 -0.014435 0.172367 [dairy eggs] [dairy eggs, produce]
406 (fresh fruits, milk) (soy lactosefree) 0.191440 0.175678 0.031581 0.164967 0.939031 1.0 -0.002050 0.987173 -0.074331 0.094122 -0.012993 0.172367 [dairy eggs, produce] [dairy eggs]
410 (milk) (fresh fruits, soy lactosefree) 0.235060 0.144455 0.031581 0.134354 0.930072 1.0 -0.002374 0.988331 -0.089493 0.090768 -0.011807 0.176488 [dairy eggs] [dairy eggs, produce]
407 (fresh fruits, soy lactosefree) (milk) 0.144455 0.235060 0.031581 0.218623 0.930072 1.0 -0.002374 0.978964 -0.080781 0.090768 -0.021488 0.176488 [dairy eggs, produce] [dairy eggs]
636 (packaged vegetables fruits, soy lactosefree) (milk) 0.107614 0.235060 0.023253 0.216080 0.919255 1.0 -0.002043 0.975788 -0.089610 0.072798 -0.024812 0.157503 [dairy eggs, produce] [dairy eggs]
637 (milk) (packaged vegetables fruits, soy lactosefree) 0.235060 0.107614 0.023253 0.098925 0.919255 1.0 -0.002043 0.990357 -0.103002 0.072798 -0.009737 0.157503 [dairy eggs] [dairy eggs, produce]
552 (soy lactosefree, fresh vegetables) (milk) 0.133309 0.235060 0.028126 0.210983 0.897567 1.0 -0.003210 0.969484 -0.116355 0.082664 -0.031477 0.165318 [dairy eggs, produce] [dairy eggs]
553 (milk) (soy lactosefree, fresh vegetables) 0.235060 0.133309 0.028126 0.119654 0.897567 1.0 -0.003210 0.984489 -0.129823 0.082664 -0.015756 0.165318 [dairy eggs] [dairy eggs, produce]
459 (packaged vegetables fruits) (packaged produce, fresh fruits) 0.514914 0.071382 0.032777 0.063655 0.891742 1.0 -0.003979 0.991747 -0.200170 0.059215 -0.008322 0.261413 [produce] [produce]
454 (packaged produce, fresh fruits) (packaged vegetables fruits) 0.071382 0.514914 0.032777 0.459171 0.891742 1.0 -0.003979 0.896930 -0.115617 0.059215 -0.114914 0.261413 [produce] [produce]
50 (packaged produce) (fresh fruits) 0.106624 0.755371 0.071382 0.669478 0.886291 1.0 -0.009158 0.740131 -0.125576 0.090287 -0.351112 0.381989 [produce] [produce]
51 (fresh fruits) (packaged produce) 0.755371 0.106624 0.071382 0.094500 0.886291 1.0 -0.009158 0.986611 -0.344029 0.090287 -0.013571 0.381989 [produce] [produce]
95 (soy lactosefree) (milk) 0.175678 0.235060 0.035668 0.203033 0.863748 1.0 -0.005627 0.959813 -0.160626 0.095098 -0.041869 0.177387 [dairy eggs] [dairy eggs]
94 (milk) (soy lactosefree) 0.235060 0.175678 0.035668 0.151741 0.863748 1.0 -0.005627 0.971782 -0.170964 0.095098 -0.029038 0.177387 [dairy eggs] [dairy eggs]
598 (packaged produce, packaged vegetables fruits) (fresh vegetables) 0.042710 0.666917 0.023897 0.559510 0.838949 1.0 -0.004587 0.756164 -0.167036 0.034848 -0.322465 0.297670 [produce] [produce]
603 (fresh vegetables) (packaged produce, packaged vegetables fruits) 0.666917 0.042710 0.023897 0.035831 0.838949 1.0 -0.004587 0.992866 -0.365617 0.034848 -0.007185 0.297670 [produce] [produce]
110 (packaged produce) (packaged vegetables fruits) 0.106624 0.514914 0.042710 0.400566 0.777928 1.0 -0.012192 0.809240 -0.242158 0.073787 -0.235727 0.241756 [produce] [produce]
111 (packaged vegetables fruits) (packaged produce) 0.514914 0.106624 0.042710 0.082945 0.777928 1.0 -0.012192 0.974180 -0.370470 0.073787 -0.026504 0.241756 [produce] [produce]
456 (fresh fruits, packaged vegetables fruits) (packaged produce) 0.415662 0.106624 0.032777 0.078854 0.739555 1.0 -0.011543 0.969853 -0.376042 0.066958 -0.031084 0.193129 [produce] [produce]
457 (packaged produce) (fresh fruits, packaged vegetables fruits) 0.106624 0.415662 0.032777 0.307405 0.739555 1.0 -0.011543 0.843693 -0.282740 0.066958 -0.185265 0.193129 [produce] [produce]
369 (fresh vegetables) (packaged produce, fresh fruits) 0.666917 0.071382 0.031815 0.047704 0.668290 1.0 -0.015791 0.975136 -0.598424 0.045032 -0.025498 0.246699 [produce] [produce]
364 (packaged produce, fresh fruits) (fresh vegetables) 0.071382 0.666917 0.031815 0.445694 0.668290 1.0 -0.015791 0.600900 -0.348327 0.045032 -0.664169 0.246699 [produce] [produce]
79 (fresh vegetables) (packaged produce) 0.666917 0.106624 0.041748 0.062598 0.587095 1.0 -0.029361 0.953035 -0.678611 0.057049 -0.049280 0.227071 [produce] [produce]
78 (packaged produce) (fresh vegetables) 0.106624 0.666917 0.041748 0.391543 0.587095 1.0 -0.029361 0.547423 -0.440479 0.057049 -0.826742 0.227071 [produce] [produce]
600 (packaged vegetables fruits, fresh vegetables) (packaged produce) 0.383016 0.106624 0.023897 0.062390 0.585146 1.0 -0.016942 0.952823 -0.534689 0.051308 -0.049512 0.143255 [produce] [produce]
601 (packaged produce) (packaged vegetables fruits, fresh vegetables) 0.106624 0.383016 0.023897 0.224120 0.585146 1.0 -0.016942 0.795206 -0.442459 0.051308 -0.257536 0.143255 [produce] [produce]
366 (fresh fruits, fresh vegetables) (packaged produce) 0.519183 0.106624 0.031815 0.061278 0.574715 1.0 -0.023543 0.951694 -0.606149 0.053561 -0.050757 0.179830 [produce] [produce]
367 (packaged produce) (fresh fruits, fresh vegetables) 0.106624 0.519183 0.031815 0.298382 0.574715 1.0 -0.023543 0.685297 -0.453047 0.053561 -0.459220 0.179830 [produce] [produce]
In [20]:
lift_medio_antecedents = r1.groupby('antecedents')['lift'].mean()

# Itemsets that, on average, boost the purchase of other items and lead to cross-selling
lift_medio_antecedents.sort_values(ascending=False).head(20)
Out[20]:
antecedents
(fresh fruits, packaged cheese, yogurt)                               1.794382
(packaged cheese, yogurt)                                             1.779831
(milk, packaged cheese, fresh fruits)                                 1.763200
(milk, packaged vegetables fruits, fresh fruits, fresh vegetables)    1.729690
(milk, packaged vegetables fruits, fresh vegetables)                  1.717134
(fresh fruits, packaged cheese, fresh vegetables)                     1.698417
(milk, packaged cheese)                                               1.697160
(milk, packaged cheese, fresh fruits, fresh vegetables)               1.693922
(fresh fruits, yogurt, fresh vegetables)                              1.676396
(milk, fresh fruits, fresh vegetables)                                1.673899
(packaged cheese, milk, packaged vegetables fruits, fresh fruits)     1.664883
(milk, packaged vegetables fruits, fresh fruits)                      1.652500
(packaged cheese, fresh fruits, packaged vegetables fruits)           1.645384
(fresh fruits, other creams cheeses, fresh vegetables)                1.643064
(fresh fruits, packaged cheese)                                       1.638795
(fresh fruits, packaged cheese, yogurt, fresh vegetables)             1.637533
(milk, packaged vegetables fruits)                                    1.636226
(packaged vegetables fruits, yogurt)                                  1.633937
(packaged cheese, milk, packaged vegetables fruits)                   1.632759
(milk, yogurt, fresh fruits)                                          1.632391
Name: lift, dtype: float64
In [21]:
# Viewing the antecedent sets with the lowest average lift (weakest cross-selling drivers)
lift_medio_antecedents.sort_values(ascending=True).head(10)
Out[21]:
antecedents
(packaged produce)                                                      0.691788
(packaged produce, fresh fruits)                                        0.780016
(packaged produce, packaged vegetables fruits)                          0.927454
(packaged produce, fresh vegetables)                                    1.060256
(specialty cheeses)                                                     1.112184
(cream, milk)                                                           1.132371
(packaged cheese, other creams cheeses, fresh vegetables)               1.138546
(packaged vegetables fruits, other creams cheeses, fresh vegetables)    1.154229
(cream, packaged vegetables fruits, fresh vegetables)                   1.154523
(packaged vegetables fruits, butter, fresh vegetables)                  1.156621
Name: lift, dtype: float64
In [22]:
plot_dispersao_support_lift_interativo(r1)

4.3 Rules - Cluster 2

In [23]:
r2.describe()
Out[23]:
antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski
count 150.000000 150.000000 150.000000 150.000000 150.000000 150.0 150.000000 150.000000 150.000000 150.000000 150.000000 150.000000
mean 0.189051 0.189051 0.034717 0.220495 1.234758 1.0 0.003690 1.042710 0.168721 0.101032 0.036156 0.220495
std 0.108536 0.108536 0.016266 0.105421 0.306060 0.0 0.006899 0.076598 0.229633 0.026528 0.066337 0.042265
min 0.045670 0.045670 0.020206 0.053398 0.808574 1.0 -0.012201 0.891749 -0.248450 0.050140 -0.121392 0.150274
25% 0.117069 0.117069 0.023917 0.142762 0.975908 1.0 -0.000558 0.996892 -0.029869 0.084260 -0.003117 0.188761
50% 0.137029 0.137029 0.028841 0.198701 1.155867 1.0 0.003925 1.026152 0.155024 0.098710 0.025485 0.216027
75% 0.231042 0.231042 0.041019 0.276537 1.476781 1.0 0.008461 1.073639 0.378187 0.112786 0.068588 0.251697
max 0.435706 0.435706 0.105093 0.535233 1.909741 1.0 0.022983 1.376164 0.619500 0.176578 0.273343 0.317040
In [24]:
# Rules with low consequent support and high lift
r2[(r2['consequent support'] < 0.1) & (r2['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[24]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
131 (chips pretzels) (water seltzer sparkling water, crackers) 0.231042 0.056556 0.024954 0.108007 1.909741 1.0 0.011887 1.057681 0.619500 0.095012 0.054536 0.274619 [snacks] [beverages, snacks]
136 (energy granola bars) (water seltzer sparkling water, chips pretzels) 0.137029 0.094260 0.024444 0.178388 1.892511 1.0 0.011528 1.102394 0.546486 0.118177 0.092883 0.218858 [snacks] [beverages, snacks]
129 (crackers) (water seltzer sparkling water, chips pretzels) 0.144397 0.094260 0.024954 0.172817 1.833407 1.0 0.011343 1.094969 0.531283 0.116771 0.086732 0.218778 [snacks] [beverages, snacks]
83 (energy granola bars) (popcorn jerky) 0.137029 0.087015 0.020206 0.147459 1.694645 1.0 0.008283 1.070899 0.474994 0.099129 0.066205 0.189837 [snacks] [snacks]
137 (chips pretzels) (water seltzer sparkling water, energy granola... 0.231042 0.062764 0.024444 0.105800 1.685685 1.0 0.009943 1.048128 0.528988 0.090749 0.045918 0.247632 [snacks] [beverages, snacks]
28 (chips pretzels) (fruit vegetable snacks) 0.231042 0.061023 0.023741 0.102755 1.683885 1.0 0.009642 1.046512 0.528163 0.088478 0.044445 0.245902 [snacks] [snacks]
71 (crackers) (popcorn jerky) 0.144397 0.087015 0.021085 0.146024 1.678150 1.0 0.008521 1.069099 0.472305 0.100251 0.064633 0.194172 [snacks] [snacks]
35 (chips pretzels) (popcorn jerky) 0.231042 0.087015 0.032446 0.140432 1.613893 1.0 0.012342 1.062145 0.494670 0.113601 0.058509 0.256655 [snacks] [snacks]
143 (chips pretzels) (water seltzer sparkling water, refrigerated) 0.231042 0.077958 0.023266 0.100700 1.291726 1.0 0.005254 1.025289 0.293699 0.081425 0.024665 0.199572 [snacks] [beverages]
135 (water seltzer sparkling water) (energy granola bars, chips pretzels) 0.435706 0.045670 0.024444 0.056103 1.228426 1.0 0.004545 1.011052 0.329527 0.053497 0.010932 0.295668 [beverages] [snacks]
142 (refrigerated) (water seltzer sparkling water, chips pretzels) 0.202219 0.094260 0.023266 0.115053 1.220597 1.0 0.004205 1.023497 0.226540 0.085157 0.022957 0.180941 [beverages] [beverages, snacks]
94 (soft drinks) (energy sports drinks) 0.269204 0.095210 0.030388 0.112882 1.185617 1.0 0.004758 1.019921 0.214228 0.090976 0.019532 0.216027 [beverages] [beverages]
148 (soft drinks) (water seltzer sparkling water, chips pretzels) 0.269204 0.094260 0.028313 0.105174 1.115784 1.0 0.002938 1.012197 0.141994 0.084479 0.012050 0.202773 [beverages] [beverages, snacks]
92 (refrigerated) (energy sports drinks) 0.202219 0.095210 0.020628 0.102009 1.071413 1.0 0.001375 1.007572 0.083549 0.074524 0.007515 0.159335 [beverages] [beverages]
111 (soft drinks) (popcorn jerky) 0.269204 0.087015 0.024778 0.092043 1.057792 1.0 0.001354 1.005539 0.074760 0.074760 0.005508 0.188402 [beverages] [snacks]
27 (chips pretzels) (energy sports drinks) 0.231042 0.095210 0.023143 0.100167 1.052073 1.0 0.001145 1.005510 0.064367 0.076352 0.005480 0.171620 [snacks] [beverages]
141 (water seltzer sparkling water) (refrigerated, chips pretzels) 0.435706 0.051579 0.023266 0.053398 1.035271 1.0 0.000793 1.001922 0.060375 0.050140 0.001918 0.252236 [beverages] [beverages, snacks]
130 (water seltzer sparkling water) (crackers, chips pretzels) 0.435706 0.056345 0.024954 0.057273 1.016474 1.0 0.000404 1.000985 0.028720 0.053424 0.000984 0.250079 [beverages] [snacks]
In [25]:
# Rules with high consequent support and high lift
r2[(r2['consequent support'] > 0.5) & (r2['lift'] > 1.0)].sort_values(by='lift', ascending=False)
Out[25]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
In [26]:
# Rules with high consequent support and low lift
r2[(r2['consequent support'] > 0.4) & (r2['lift'] < 1.0)].sort_values(by='lift', ascending=True)
Out[26]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
63 (cookies cakes) (water seltzer sparkling water) 0.106605 0.435706 0.039146 0.367206 0.842782 1.0 -0.007303 0.891749 -0.172737 0.077800 -0.121392 0.228525 [snacks] [beverages]
17 (candy chocolate) (water seltzer sparkling water) 0.135112 0.435706 0.049627 0.367304 0.843009 1.0 -0.009242 0.891888 -0.177170 0.095219 -0.121217 0.240602 [snacks] [beverages]
97 (energy sports drinks) (water seltzer sparkling water) 0.095210 0.435706 0.035928 0.377355 0.866077 1.0 -0.005556 0.906285 -0.145959 0.072583 -0.103406 0.229907 [beverages] [beverages]
119 (refrigerated) (water seltzer sparkling water) 0.202219 0.435706 0.077958 0.385512 0.884797 1.0 -0.010150 0.918315 -0.140307 0.139219 -0.088951 0.282217 [beverages] [beverages]
104 (juice nectars) (water seltzer sparkling water) 0.135921 0.435706 0.052511 0.386337 0.886692 1.0 -0.006710 0.919550 -0.128835 0.101155 -0.087488 0.253429 [beverages] [beverages]
123 (soft drinks) (water seltzer sparkling water) 0.269204 0.435706 0.105093 0.390384 0.895980 1.0 -0.012201 0.925655 -0.137085 0.175208 -0.080317 0.315793 [beverages] [beverages]
78 (crackers) (water seltzer sparkling water) 0.144397 0.435706 0.056556 0.391670 0.898931 1.0 -0.006359 0.927611 -0.116145 0.108025 -0.078038 0.260736 [snacks] [beverages]
112 (popcorn jerky) (water seltzer sparkling water) 0.087015 0.435706 0.034574 0.397332 0.911927 1.0 -0.003339 0.936326 -0.095664 0.070826 -0.068004 0.238342 [snacks] [beverages]
43 (chips pretzels) (water seltzer sparkling water) 0.231042 0.435706 0.094260 0.407977 0.936358 1.0 -0.006407 0.953162 -0.081212 0.164650 -0.049140 0.312158 [snacks] [beverages]
125 (tea) (water seltzer sparkling water) 0.124842 0.435706 0.051614 0.413439 0.948893 1.0 -0.002780 0.962037 -0.057975 0.101417 -0.039461 0.265950 [beverages] [beverages]
109 (nuts seeds dried fruit) (water seltzer sparkling water) 0.100591 0.435706 0.041643 0.413986 0.950149 1.0 -0.002185 0.962936 -0.055119 0.084187 -0.038491 0.254781 [snacks] [beverages]
99 (fruit vegetable snacks) (water seltzer sparkling water) 0.061023 0.435706 0.025412 0.416427 0.955751 1.0 -0.001177 0.966963 -0.046990 0.053916 -0.034166 0.237375 [snacks] [beverages]
53 (coffee) (water seltzer sparkling water) 0.117069 0.435706 0.049662 0.424215 0.973626 1.0 -0.001345 0.980043 -0.029766 0.098710 -0.020364 0.269098 [beverages] [beverages]
146 (soft drinks, chips pretzels) (water seltzer sparkling water) 0.066123 0.435706 0.028313 0.428191 0.982753 1.0 -0.000497 0.986858 -0.018446 0.059794 -0.013317 0.246587 [beverages, snacks] [beverages]
In [27]:
# Rules with lift < 1.0 (negative associations)
r2[r2['lift'] < 1.0]
Out[27]:
antecedents consequents antecedent support consequent support support confidence lift representativity leverage conviction zhangs_metric jaccard certainty kulczynski antecedent_departments consequent_departments
106 (nuts seeds dried fruit) (soft drinks) 0.100591 0.269204 0.026871 0.267133 0.992308 1.0 -0.000208 0.997174 -0.008545 0.078359 -0.002834 0.183475 [snacks] [beverages]
107 (soft drinks) (nuts seeds dried fruit) 0.269204 0.100591 0.026871 0.099817 0.992308 1.0 -0.000208 0.999140 -0.010496 0.078359 -0.000860 0.183475 [beverages] [snacks]
146 (soft drinks, chips pretzels) (water seltzer sparkling water) 0.066123 0.435706 0.028313 0.428191 0.982753 1.0 -0.000497 0.986858 -0.018446 0.059794 -0.013317 0.246587 [beverages, snacks] [beverages]
147 (water seltzer sparkling water) (soft drinks, chips pretzels) 0.435706 0.066123 0.028313 0.064982 0.982753 1.0 -0.000497 0.998780 -0.030163 0.059794 -0.001221 0.246587 [beverages] [beverages, snacks]
52 (water seltzer sparkling water) (coffee) 0.435706 0.117069 0.049662 0.113981 0.973626 1.0 -0.001345 0.996515 -0.045805 0.098710 -0.003497 0.269098 [beverages] [beverages]
53 (coffee) (water seltzer sparkling water) 0.117069 0.435706 0.049662 0.424215 0.973626 1.0 -0.001345 0.980043 -0.029766 0.098710 -0.020364 0.269098 [beverages] [beverages]
59 (cookies cakes) (refrigerated) 0.106605 0.202219 0.020980 0.196800 0.973199 1.0 -0.000578 0.993252 -0.029903 0.072886 -0.006793 0.150274 [snacks] [beverages]
58 (refrigerated) (cookies cakes) 0.202219 0.106605 0.020980 0.103748 0.973199 1.0 -0.000578 0.996812 -0.033367 0.072886 -0.003198 0.150274 [beverages] [snacks]
99 (fruit vegetable snacks) (water seltzer sparkling water) 0.061023 0.435706 0.025412 0.416427 0.955751 1.0 -0.001177 0.966963 -0.046990 0.053916 -0.034166 0.237375 [snacks] [beverages]
98 (water seltzer sparkling water) (fruit vegetable snacks) 0.435706 0.061023 0.025412 0.058323 0.955751 1.0 -0.001177 0.997133 -0.075825 0.053916 -0.002876 0.237375 [beverages] [snacks]
109 (nuts seeds dried fruit) (water seltzer sparkling water) 0.100591 0.435706 0.041643 0.413986 0.950149 1.0 -0.002185 0.962936 -0.055119 0.084187 -0.038491 0.254781 [snacks] [beverages]
108 (water seltzer sparkling water) (nuts seeds dried fruit) 0.435706 0.100591 0.041643 0.095576 0.950149 1.0 -0.002185 0.994456 -0.085067 0.084187 -0.005575 0.254781 [beverages] [snacks]
11 (refrigerated) (candy chocolate) 0.202219 0.135112 0.025957 0.128359 0.950020 1.0 -0.001366 0.992253 -0.061865 0.083362 -0.007808 0.160236 [beverages] [snacks]
10 (candy chocolate) (refrigerated) 0.135112 0.202219 0.025957 0.192112 0.950020 1.0 -0.001366 0.987490 -0.057340 0.083362 -0.012669 0.160236 [snacks] [beverages]
124 (water seltzer sparkling water) (tea) 0.435706 0.124842 0.051614 0.118461 0.948893 1.0 -0.002780 0.992762 -0.087130 0.101417 -0.007290 0.265950 [beverages] [beverages]
125 (tea) (water seltzer sparkling water) 0.124842 0.435706 0.051614 0.413439 0.948893 1.0 -0.002780 0.962037 -0.057975 0.101417 -0.039461 0.265950 [beverages] [beverages]
86 (energy granola bars) (soft drinks) 0.137029 0.269204 0.034574 0.252310 0.937246 1.0 -0.002315 0.977406 -0.072001 0.093025 -0.023117 0.190370 [snacks] [beverages]
87 (soft drinks) (energy granola bars) 0.269204 0.137029 0.034574 0.128430 0.937246 1.0 -0.002315 0.990134 -0.083931 0.093025 -0.009965 0.190370 [beverages] [snacks]
42 (water seltzer sparkling water) (chips pretzels) 0.435706 0.231042 0.094260 0.216338 0.936358 1.0 -0.006407 0.981237 -0.107500 0.164650 -0.019122 0.312158 [beverages] [snacks]
43 (chips pretzels) (water seltzer sparkling water) 0.231042 0.435706 0.094260 0.407977 0.936358 1.0 -0.006407 0.953162 -0.081212 0.164650 -0.049140 0.312158 [snacks] [beverages]
113 (water seltzer sparkling water) (popcorn jerky) 0.435706 0.087015 0.034574 0.079351 0.911927 1.0 -0.003339 0.991676 -0.146139 0.070826 -0.008394 0.238342 [beverages] [snacks]
112 (popcorn jerky) (water seltzer sparkling water) 0.087015 0.435706 0.034574 0.397332 0.911927 1.0 -0.003339 0.936326 -0.095664 0.070826 -0.068004 0.238342 [snacks] [beverages]
79 (water seltzer sparkling water) (crackers) 0.435706 0.144397 0.056556 0.129803 0.898931 1.0 -0.006359 0.983229 -0.166142 0.108025 -0.017057 0.260736 [beverages] [snacks]
78 (crackers) (water seltzer sparkling water) 0.144397 0.435706 0.056556 0.391670 0.898931 1.0 -0.006359 0.927611 -0.116145 0.108025 -0.078038 0.260736 [snacks] [beverages]
123 (soft drinks) (water seltzer sparkling water) 0.269204 0.435706 0.105093 0.390384 0.895980 1.0 -0.012201 0.925655 -0.137085 0.175208 -0.080317 0.315793 [beverages] [beverages]
122 (water seltzer sparkling water) (soft drinks) 0.435706 0.269204 0.105093 0.241201 0.895980 1.0 -0.012201 0.963096 -0.170632 0.175208 -0.038318 0.315793 [beverages] [beverages]
121 (tea) (soft drinks) 0.124842 0.269204 0.030107 0.241161 0.895830 1.0 -0.003501 0.963045 -0.117287 0.082725 -0.038373 0.176499 [beverages] [beverages]
120 (soft drinks) (tea) 0.269204 0.124842 0.030107 0.111837 0.895830 1.0 -0.003501 0.985358 -0.137276 0.082725 -0.014860 0.176499 [beverages] [beverages]
104 (juice nectars) (water seltzer sparkling water) 0.135921 0.435706 0.052511 0.386337 0.886692 1.0 -0.006710 0.919550 -0.128835 0.101155 -0.087488 0.253429 [beverages] [beverages]
105 (water seltzer sparkling water) (juice nectars) 0.435706 0.135921 0.052511 0.120520 0.886692 1.0 -0.006710 0.982489 -0.184642 0.101155 -0.017824 0.253429 [beverages] [beverages]
49 (soft drinks) (coffee) 0.269204 0.117069 0.027926 0.103737 0.886117 1.0 -0.003589 0.985125 -0.149560 0.077931 -0.015100 0.171141 [beverages] [beverages]
48 (coffee) (soft drinks) 0.117069 0.269204 0.027926 0.238546 0.886117 1.0 -0.003589 0.959738 -0.127065 0.077931 -0.041951 0.171141 [beverages] [beverages]
118 (water seltzer sparkling water) (refrigerated) 0.435706 0.202219 0.077958 0.178923 0.884797 1.0 -0.010150 0.971627 -0.187477 0.139219 -0.029201 0.282217 [beverages] [beverages]
119 (refrigerated) (water seltzer sparkling water) 0.202219 0.435706 0.077958 0.385512 0.884797 1.0 -0.010150 0.918315 -0.140307 0.139219 -0.088951 0.282217 [beverages] [beverages]
97 (energy sports drinks) (water seltzer sparkling water) 0.095210 0.435706 0.035928 0.377355 0.866077 1.0 -0.005556 0.906285 -0.145959 0.072583 -0.103406 0.229907 [beverages] [beverages]
96 (water seltzer sparkling water) (energy sports drinks) 0.435706 0.095210 0.035928 0.082459 0.866077 1.0 -0.005556 0.986103 -0.215088 0.072583 -0.014093 0.229907 [beverages] [beverages]
16 (water seltzer sparkling water) (candy chocolate) 0.435706 0.135112 0.049627 0.113901 0.843009 1.0 -0.009242 0.976062 -0.248130 0.095219 -0.024525 0.240602 [beverages] [snacks]
17 (candy chocolate) (water seltzer sparkling water) 0.135112 0.435706 0.049627 0.367304 0.843009 1.0 -0.009242 0.891888 -0.177170 0.095219 -0.121217 0.240602 [snacks] [beverages]
63 (cookies cakes) (water seltzer sparkling water) 0.106605 0.435706 0.039146 0.367206 0.842782 1.0 -0.007303 0.891749 -0.172737 0.077800 -0.121392 0.228525 [snacks] [beverages]
62 (water seltzer sparkling water) (cookies cakes) 0.435706 0.106605 0.039146 0.089845 0.842782 1.0 -0.007303 0.981585 -0.248450 0.077800 -0.018760 0.228525 [beverages] [snacks]
114 (refrigerated) (soft drinks) 0.202219 0.269204 0.044017 0.217671 0.808574 1.0 -0.010421 0.934129 -0.228844 0.102987 -0.070516 0.190590 [beverages] [beverages]
115 (soft drinks) (refrigerated) 0.269204 0.202219 0.044017 0.163509 0.808574 1.0 -0.010421 0.953723 -0.244687 0.102987 -0.048522 0.190590 [beverages] [beverages]
In [28]:
lift_medio_antecedents = r2.groupby('antecedents')['lift'].mean()

# Viewing the antecedent sets that, on average, boost purchases of other items and drive cross-selling
lift_medio_antecedents.sort_values(ascending=False).head(20)
Out[28]:
antecedents
(water seltzer sparkling water, crackers)               1.909741
(water seltzer sparkling water, energy granola bars)    1.685685
(water seltzer sparkling water, chips pretzels)         1.515575
(energy granola bars)                                   1.425012
(crackers)                                              1.407476
(nuts seeds dried fruit)                                1.395586
(popcorn jerky)                                         1.391282
(cookies cakes)                                         1.378499
(chips pretzels)                                        1.371520
(fruit vegetable snacks)                                1.319818
(candy chocolate)                                       1.305883
(water seltzer sparkling water, refrigerated)           1.291726
(energy granola bars, chips pretzels)                   1.228426
(tea)                                                   1.175664
(water seltzer sparkling water, soft drinks)            1.166067
(juice nectars)                                         1.163754
(coffee)                                                1.138120
(refrigerated)                                          1.098669
(energy sports drinks)                                  1.043795
(refrigerated, chips pretzels)                          1.035271
Name: lift, dtype: float64
In [29]:
# Viewing the antecedent sets with the lowest average lift (weakest cross-selling drivers)
lift_medio_antecedents.sort_values(ascending=True).head(10)
Out[29]:
antecedents
(water seltzer sparkling water)                 0.950508
(soft drinks, chips pretzels)                   0.982753
(crackers, chips pretzels)                      1.016474
(soft drinks)                                   1.018923
(refrigerated, chips pretzels)                  1.035271
(energy sports drinks)                          1.043795
(refrigerated)                                  1.098669
(coffee)                                        1.138120
(juice nectars)                                 1.163754
(water seltzer sparkling water, soft drinks)    1.166067
Name: lift, dtype: float64
In [30]:
plot_dispersao_support_lift_interativo(r2, cor = 'tomato')

4.4 Conclusion

The plots show that the higher a rule's support, the lower its lift tends to be: the most frequent product categories end up in the same basket largely because each is common on its own, not because one drives the other.
Conversely, rules with low support and high lift point to less common but more meaningful associations, in which the categories involved are bought together more often than chance would predict.
A closer look at each cluster's specific rules reveals associations exclusive to that cluster. With this information, more efficient recommendation systems with better returns can be proposed, encouraging cross-category purchases where the positive association is strong and avoiding ineffective combinations.
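
To make the relationship between support and lift concrete, the sketch below recomputes confidence, lift and leverage for rule 131 of the Cluster 2 table (chips pretzels → water seltzer sparkling water, crackers) from its three support values. The helper name `rule_metrics` is an assumption made for illustration and is not part of the project's `src` module; the formulas are the standard definitions of these metrics.

```python
def rule_metrics(antecedent_support, consequent_support, support):
    """Return (confidence, lift, leverage) for a rule A -> C.

    confidence = P(A and C) / P(A)
    lift       = P(A and C) / (P(A) * P(C))
    leverage   = P(A and C) - P(A) * P(C)
    """
    expected_if_independent = antecedent_support * consequent_support
    confidence = support / antecedent_support
    lift = support / expected_if_independent
    leverage = support - expected_if_independent
    return confidence, lift, leverage


# Support values copied from rule 131 in the Cluster 2 output (rounded to 6 decimals).
confidence, lift, leverage = rule_metrics(
    antecedent_support=0.231042,
    consequent_support=0.056556,
    support=0.024954,
)
print(f"confidence={confidence:.4f} lift={lift:.4f} leverage={leverage:.4f}")
```

The results agree with the table to about three decimal places; the small differences arise because the table was computed from unrounded supports. Because lift divides by the product of the two individual supports, very common categories need a very high joint support to reach a lift well above 1, which is consistent with the pattern seen in the plots.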

5. Selected Association Rules

After a detailed analysis of the generated association rules, we selected for each cluster a set of exclusive, highly significant rules with high lift and above-average confidence, in order to maximize the impact of marketing strategies and positively influence users' purchasing decisions.
Note: Rules with high confidence are used so that recommendations reflect consistent purchasing patterns, increasing the relevance of the suggestions and the likelihood of conversion.
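
The selection criterion described above (high lift and above-average confidence) can be sketched as a simple filter. This is an illustration only, not the procedure actually used to pick the rules: the function name `select_strong_rules`, the `min_lift` and `top_n` parameters, and the toy DataFrame are assumptions. Only the column names `confidence` and `lift` come from the rule tables.

```python
import pandas as pd


def select_strong_rules(rules: pd.DataFrame, min_lift: float = 1.0, top_n: int = 4) -> pd.DataFrame:
    """Keep rules with confidence above the mean and lift above min_lift, ranked by lift."""
    mean_confidence = rules["confidence"].mean()
    mask = (rules["confidence"] > mean_confidence) & (rules["lift"] > min_lift)
    return rules[mask].sort_values("lift", ascending=False).head(top_n)


# Toy subset built from Cluster 2 values shown in this notebook (not the full rule set).
toy = pd.DataFrame({
    "antecedents": ["cookies cakes", "crackers", "chips pretzels", "soft drinks"],
    "consequents": ["chips pretzels", "chips pretzels", "popcorn jerky",
                    "water seltzer sparkling water"],
    "confidence": [0.399, 0.390, 0.140432, 0.390384],
    "lift": [1.728, 1.689, 1.613893, 0.895980],
})

selected = select_strong_rules(toy, top_n=2)
print(selected[["antecedents", "consequents", "confidence", "lift"]])
```

The explicit `lift > min_lift` condition matters because a rule can reach high confidence simply because its consequent is very common, as with the rules pointing to water seltzer sparkling water in Cluster 2, which have confidence near 0.39 but lift below 1.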

5.1 Rules - Cluster 0

Antecedents → Consequents: (support; confidence; lift)

milk, fresh fruits, fresh vegetables → yogurt, packaged cheese: (0.028; 0.283; 2.346)
fresh fruits, packaged cheese → yogurt, fresh vegetables: (0.056; 0.303; 1.927)
chips pretzels, fresh vegetables → fresh fruits, packaged vegetables fruits: (0.045; 0.446; 1.806)
milk, frozen produce → yogurt: (0.025; 0.523; 1.560)

5.2 Rules - Cluster 1

Antecedents → Consequents: (support; confidence; lift)

packaged cheese, yogurt → milk, fresh vegetables: (0.026; 0.350; 2.098)
milk, fresh vegetables → fresh fruits, packaged cheese: (0.049; 0.293; 1.702)
milk, fresh herbs → fresh fruits, fresh vegetables: (0.034; 0.795; 1.531)
fresh fruits, soy lactosefree → packaged vegetables fruits, fresh vegetables: (0.075; 0.541; 1.414)

5.3 Rules - Cluster 2

Antecedents → Consequents: (support; confidence; lift)

nuts seeds dried fruit → energy granola bars: (0.025; 0.250; 1.824)
cookies cakes → chips pretzels: (0.043; 0.399; 1.728)
crackers → chips pretzels: (0.056; 0.390; 1.689)
refrigerated → juice nectars: (0.045; 0.224; 1.646)

In the next notebook, we conclude the study by proposing marketing strategies and methods for influencing user purchases in the app for each customer group, based on its profile and its selected association rules, with the aim of maximizing their impact.